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Abstract 

Background: Pectin methylesterases (PMEs) catalyze the demethylesterification of homogalacturonans in the cell 
wall; their activity is regulated in part by pectin methylesterase inhibitors (PMEIs). PME activity may result in either 
rigidification or loosening of the cell wall, depending on the mode of demethylesterification. The activity of PMEs in 
the middle lamella is expected to affect intrusive elongation of phloem fibers, and their adhesion to adjacent cells. 
Length and extractability of phloem fibers are qualities important for their industrial uses in textiles and composites. 
As only three flax PMEs had been previously described, we were motivated to characterize the PME and PMEI gene 
families of flax. 

Results: We identified 105 putative flax PMEs (LuPMEs) and 95 putative PMEIs (LuPMEIs) within the whole-genome 
assembly. We found experimental evidence for the transcription of 77/105 LuPMEs and 83/95 LuPMEIs, and surveyed 
the transcript abundance of these in 12 different tissues and stages of development. Six major monophyletic groups 
of LuPMEs could be defined based on the inferred relationships of flax genes and their presumed orthologs from 
other species. We searched the LuPMEs and LuPMEIs for conserved residues previously reported to be important for 
their tertiary structure and function. In the LuPMEs, the most highly conserved residues were catalytic residues while 
in the LuPMEIs, cysteines forming disulfude bridges between helices a2 and a3 were most highly conserved. In 
general, the conservation of critical residues was higher in the genes with evidence of transcript expression than in 
those for which no expression was detected. 

Conclusions: The LuPMEs and LuPMEIs comprise large families with complex patterns of transcript expression and 
a wide range of physical characteristics. We observed that multiple PMEs and PMEIs are expressed in partially 
overlapping domains, indicative of several genes acting redundantly during most processes. The potential for 
functional redundancy was highlighted also by the phylogenetic analyses. We were able to identify a subset of 
PME and PMEIs that appeared particularly relevant to fiber development, which may provide a basis for the 
improvement of key traits in industrial feedstocks and a better understanding of the physiological roles of PMEs 
and PMEIs in general. 
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Background 

Pectins are complex polysaccharides present in the plant 
cell wall and in the middle lamella and are dynamically 
modified by pectin methylesterases (PMEs). The PME 
gene family was first described by Richard et al [1], and 
later classified in the Carbohydrate Active Enzymes data- 
base (CAZy) as class 8 of the carbohydrate esterases (EC 
3.1.1.11) [2]. In current models, pectins are synthesized 
in the Golgi complex as highly methylesterified polymers 
(e.g. homogalacturonan, HG) that are secreted to the cell 
wall. Once in the cell wall, PMEs catalyze the demethy- 
lesterification of HG, which generates negatively charged 
carboxyl groups. If demethylesterification occurs on 
contiguous sugar residues (i.e. blockwise demethylester- 
ification), Ca^"^ bonds can form between pectin mole- 
cules, thereby rigidifying the cell wall. Conversely, if the 
demethylesterification occurs on non-contiguous sugars 
(i.e. random demethylesterification), the molecule be- 
comes a substrate for pectin degrading enzymes, leading 
to cell wall loosening [3,4]. The activity of the PMEs is 
regulated by pectin methylesterase inhibitors (PMEIs) 
[5], which bind to the active site of the PME, generating 
a 1:1 complex [6,7]. PMEs are classified as either Type-1 
PMEs (i.e. those with a pro-region, similar to the PMEI 
domain), or Type-2 PMEs (no pro-region). In Type-1 
PMEs, the pro-region and the PME domain are trans- 
lated as part of the same protein and then, in the Golgi 
complex, as a pre-requisite for secretion to the cell 
wall, the pro region is removed by a subtilisin-like 
protease [8]. 

The bast (phloem) fibers of flax {Linum usitatissimum) 
are valued industrially for their length and strength. Ex- 
traction of high quality fibers requires retting, a process 
by which stems are exposed to the action of microbes 
that degrade the middle lamella and so, facilitate separ- 
ation of fibers from surrounding tissues. Flax fibers grow 
from the shoot apex intrusively after a very short period 
of coordinated growth [9]. During intrusive elongation, 
fibers first penetrate the middle lamella between adja- 
cent cells, and subsequently generate new contact inter- 
faces. Both of these processes presumably influence fiber 
length and the efficiency of retting, and are dependent 
on the activity of PMEs. 

Different varieties of flax are grown for either fibers or 
for seeds (i.e. linseed) [10]. Although stems of linseed 
varieties contain fiber, these fibers are not harvested, be- 
cause of relatively low fiber yield and the difficulty of 
retting in the environments where linseed is typically 
grown. A better understanding of PMEs is therefore 
important to the development of dual-purpose flax, in 
which both fibers and seeds can be utilized from a single 
variety. 

Three PMEs have been previously characterized in flax: 
LuPMEl, LuPME3, and LuPMES [11-14]. These are all 



Type-1 PMEs. Al-Qsous et al [13] reported that in 2 dpg 
(days past germination) hypocotyls, transcript abundance 
of LuPMES is higher in the apical region, while LuPME3 
transcript abundance is higher in the basal region. Also, 
LuPMES has the highest transcript abundance of the 
three characterized LuPMEs in hypocotyls. The highest 
transcript abundance of LuPME3 in seedlings is in the 
roots [11]. Here, we expand on these studies and present 
an analysis of the complete family of PMEs and PMEIs in 
flax, based on the recently assembled whole genome 
sequence of the linseed variety CDC Bethune [IS]. A 
specific objective of this research is to identify PMEs 
that are expressed during stages of fiber development that 
are likely to influence the industrially relevant properties 
of flax bast fibers. 

Methods 

Annotation of PME and PMEI domain in flax and 
other species 

Predicted proteins that contained PME (PF0109S) and/or 
PMEI (PF04043) domains were identified from the whole 
genome shotgun (WGS) assembly of flax [IS] (version 1.0) 
using default parameters in hmmsearch/PfamScan [16]. 
The predicted proteins from the flax WGS assembly 
were also aligned to previously described PMEs and 
PMEIs from Arabidopsis obtained from TAIR [17], using 
BLASTp. All of the LuPMEs/LuPMEIs that were identi- 
fied by BLASTp to Arabidopsis were also identified by 
HMM-alignment to the PFAM domains. Predicted flax 
proteins that had both a PMEI and PME domain were 
designated Type-1 PMEs, and proteins with a PME do- 
main (but no PMEI domain) were designated Type-2 
PMEs. Genes with questionable PFAM annotations (i.e. 
significant PME and/or PMEI domain but low e-value; 
low coverage of the domain; more than one PME or 
PMEI domain; an extra domain different from PME or 
PMEI), and genes that were adjacent on scaffolds of the 
WGS assembly were manually curated, which included 
reanalysis of their predicted gene structures by submit- 
ting their genomic sequence (i.e. the predicted gene plus 
1000 bp up and downstream) to the Augustus web server 
[18]. The Augustus A, thaliana gene model parameters 
were used for the gene reannotation, in combination with 
any ESTs that aligned to the prediction region (9S% identity 
and 90% coverage) [19] as well as unpublished RNAseq 
reads (http://www.onekp.com/, version April 2S 2013). 

PMEs and PMEIs in other plants 

An hmmsearch using PFAM domains PF0109S (Pectines- 
terase) and PF04043 (PMEI) was conducted with default 
parameters on transcripts deposited in Phytozome (ver- 
sion 9.1). To determine the statistical significance of the 
presence of the PME domain, all the protein sequences 
that had the domain were retrieved, and these were 
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searched again against PFAM using batch search. For 
the putative PMEIs, protein sequences that had a PMEI 
domain but not a PME domain were obtained and then 
searched again on PFAM to estabUsh the statistical sig- 
nificance of the predicted domains and confirm the 
absence of a PME domain. 

Primer design for qRT-PCR 

The Universal ProbeLibrary Assay Design Center (Roche) 
was used to design specific primers and probes for each 
gene. Groups of 10 closely related genes were submitted 
in batches for the design of specific primers and Roche 
UPL probes. The specificity of primers was evaluated by 
BLASTn alignment of the primers against the complete 
predicted transcriptome and the entire genome assembly. 
All primer pairs were designed so at least one primer of 
each pair had three or more mismatches to any off-target 
gene, near the 3' of the primer. For those genes for which 
a specific primer could not be designed, a primer common 
to two PME or PMEI genes was used. The list of primers 
can be accessed in Additional file 1: Table SI. 

Tissues for quantitative real time PGR using a 96.96 
Dynamic array 

RNA was obtained from 12 different tissues from three 
biological replicates. Each biological replicate was assayed 
independently. Five of the tissues/organs (shoot apex (SA), 
leaves (L), roots (R), early cortical peel (ECP), and early 
fibers (EF)) were collected from vegetative stage plants 
1 month after germination; the other seven tissues (senes- 
cent leaves (SL), xylem (X), late cortical peel (LCP), late 
fibers (LF), flower buds (FB), flowers (F), and green bolls 
(B)) were collected from plants 2 months after germin- 
ation, at the green capsule stage. The cortical peel, xylem, 
and fiber tissues were obtained from the first 15 cm of the 
plant from the hypocotyls to the top. The shoot apex 
tissue corresponds to the top 2 cm of the plant. A phenol/ 
chloroform based method was used for extraction of 
RNA, with subsequent treatment with DNAse. 5 ng of 
RNA were used to synthesize the cDNA for the 96.96 
dynamic array (Fluidigm Corporation, CA, USA). The 
cDNA was tested for genomic DNA contamination by 
PGR using a set of primers flanking an intron. 

A total of 12.5 ng of cDNA were used for the pre- 
amplification reaction containing 50 nm of each primer 
pair (a pool consisting of 89 primers for the PME and/or 
PMEI genes plus 3 endogenous controls GAPDH, ETIFl, 
ETIF5A [20], and Ix TaqMan Pre Amp Master Mix 
(Applied Biosystems) in a final volume of 10 [iL, The 
following thermal cycles were followed: 1 cycle: 95°C 
10 min; 14 cycles: 95°C 15 seconds, 60°C 4 min. The 
pre-amplified product was diluted 1:5 and the pre- 
amplification reaction was tested doing a pass/fail test 



with GAPDH endogenous control primers, verifying 
that the Ct value was close to 20. 

Primer and Roche UPL probe mix ("promer") was pre- 
pared by mixing 2 (il of a 20 [xM mix of both primers 
for each 1 [iL of the respective probe (10 [iM stock). The 
"promer" was tested with an equimolar mixture of cDNA 
from all the tissues (except fiber, which are included in the 
cortical peels), if it did not work new primers were 
designed, if the new primers did not work, it was 
presumed to be not expressed and the primers were 
used regardless, and run in the 96.96 dynamic array 
(Fluidigm Corporation, CA, USA). 

Fluidigm 96.96 control line fluid was used to prime 
the fluidics arrays with the 136x chip prime script. Then 
the appropriate inlets were loaded with the different as- 
says and sample mixes. The three biological replicates 
for each tissue were each placed in three different posi- 
tions on the array for three technical replicates each. 

The manufacturer s protocol was followed to prepare 
the assay and sample mixes. The assay inlets contained a 
6.5 [iL assay mixture containing Ix DA assay loading re- 
agent and 2 [iL of the respective "promer" (primer + probe) 
mixture for each inlet. The sample inlets contained Ix 
TaqMan^ Universal PCR Master Mix, No AmpErase^ 
UNG (Applied Biosystems PN 4324018), Ix DA sample 
loading reagent (Fluidigm PN 85000735), and 2.5 [iL of 
the respective preamplified sample. 

Once the samples and assays inlets were loaded, the 
136x load mix script was executed to load the samples 
and assays. Once loaded, the chip was moved to the 
Biomark instrument and the following thermal cycles 
were executed: 1 cycle: 95°C 10 min; 40 cycles: 95°C 
15 seconds, 60°C 1 min. 

Analysis of 96.96 dynamic array results 

Only those wells with a quality score of > 0.65 were used 
in further analyses. The mean of the technical replicates 
was calculated. Then the delta-Cx was obtained by cal- 
culating the geometric mean of the endogenous controls 
for the given tissue/biological-replicate, and subtracting 
that value to the Cx of the gene at that tissue/biological- 
replicate. Subsequently, the mean and the standard error 
of the delta-Cx of the three biological replicates were 
calculated. 



EST and RNAseq data mapping 

The EST and assembled RNAseq data (Deyholos et al, 
manuscript in preparation) were mapped against the PME 
and PMEI CDS sequences, using the read mapping tool, 
on the CLC Genomics Workbench 6.0.1, with the default 
parameters, except for the length fraction (0.8), and the 
similarity fraction (0.9 or 0.98). 
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Signal peptide, transmembrane domain, and protein 
subcellular localization predictions 

SignalP 4.0 was used to search for signal peptides [21] 
Transmembrane domains were predicted using TMHMM 
v.2.0 [22]. The protein subcellular localization was pre- 
dicted using WoLF PSORT and Plant-mPLoc [23,24]. 

Cleavage site prediction 

Proteolytic cleavage sites were predicted using a protease 
recognition pattern described by Pelloux et al and 
Wolf et al [4,8]. The motif [RKQ] [RKEHLN] [LDMI] 
[LMAKR] was searched in the Type-1 PME proteins using 
"Protein Pattern Find" at http://www.bioinformatics.org/. 
These sites were also identified visually on a ClustalW mul- 
tiple alignment of the protein sequences of sequences in 
the same phylogenetic group. This allowed us to confirm 
the motifs found with the web tool, and also to identify pos- 
sible novel cleavage recognition motifs, by comparison of 
the aligned sequences with known motifs. 

Isoelectric point 

The predicted isoelectric point of the complete and the 
mature proteins (i.e. after signal peptide and/or cleavage 
site removal) was calculated using Vector NTI 10 [25]. 

Phylogenetic analysis 

Phylogenetic relationships among the PMEs and PMEIs 
from flax, Manihot esculenta, Ricinus communis, Popu- 
lus trichocarpa, and Arabidopsis thaliana were inferred 
as follows. The PME and PMEI protein sequences from 
M, esculenta, R, communis, P. trichocarpa, and A, thaliana 
were extracted from Phytozome (version 9.1) using the 
PFAM identifiers PF04043 (for PMEI) and PF01095 (for 
PME) in a keyword-based ontology search. Alignments 
for the complete PMEs and PMEIs proteins of these 
four species plus flax, and for the LuPMEs and LuPMEIs 
proteins of flax alone, were constructed using MUSCLE 
[26]. The alignments were used to first determine the 
substitution model that best described the evolutionary 
process of each set of proteins, using ProtTest [27], and 
then these models were used to construct maximum 
likelihood trees using GARLI [28] under the CIPRES 
web interface [29], with 100 bootstraps and 2 search- 
replicates. The result of the analysis in ProtTest showed 
that the model of evolution that best fit the set of genes 
for LuPMEs was WAG + I + G + F, and the same model 
was obtained for the LuPMEIs. For the analysis of the 
PMEs and PMEIs in all the analysed species, the best 
model was WAG + G + F. To estimate the divergence 
time of presumptive paralogs (Ks) we aligned the nucleo- 
tide sequences of the LuPMEs and the LuPMEIs, and 
then we used MEGA5 [30] to determine genetic dis- 
tance, for which we used the Kimura 2-parameter model 
[31] with the pairwise deletion option, and then calculated 



the divergence time using t = K/2r, where t is time, K is 
the genetic distance, and r is the substitution rate, either 
1.5 X 10-8 [32] or 8.1 x 10-9 [33]. 

Conserved residues 

The presence of the most important residues for the 
protein activity was established based on the structural 
analysis done for the PME [7] and PMEI [6,7]. For PME 
11 important residues were searched: six active sites with 
conserved aromatic residues, three catalytic residues, and 
two protein stabilizers. For PMEI, 33 important residues 
were analysed, including residues interacting with the 
active sites of the PME, residues responsible for disul- 
fide bridging, and several residues responsible for main- 
taining the structure of the protein. 

Results and discussion 

Annotation of LuPMEs and LuPMEIs 

We identified 105 putative LuPMEs and 95 putative 
LuPMEIs (Additional file 2) by searching predicted tran- 
scripts of the flax whole-genome assembly (version 1.0) 
[15] for the PFAM domains Pectinesterase (PF01095) and 
PMEI (PF04043) [16]. Independent alignment of the 
Arabidopsis thaliana PME and PMEI families [34] to 
the flax genome did not identify any additional flax genes 
other than those identified by the PFAM domain align- 
ment. Among the predicted LuPMEs, 60 were Type-1 
(i.e. encoding both a PMEI (PF04043) and PME (PF01095) 
domain [3]), and 45 were Type-2 (i.e. encoding a PME 
domain, but no PMEI domain [3]; Figure 1). Only one 
of the genes (LuPME89) contained an additional PFAM 
domain other than a PME or PMEI domain. This was a 
zf-RING_2 domain (PF13639). 

Detection of PMEs and PMEIs in other plants 

To identify putative PMEs and PMEIs in species other 
than flax, primary transcripts in the Phytozome v9.1 
database were searched for the presence of a PME or 
PMEI domain. The number of predicted PMEs and PMEIs 
in each species was compared as a proportion of aU 
proteins predicted for each species (Additional file 3: 
Figure SI). The proportion of PMEs (0.25%) and PMEIs 
(0.22%) in flax was similar to the average proportion in 
the Malpighiales species sampled, i.e. 0.23% and 0.16%, 
respectively. Among the plants analyzed, Mimulus gut- 
tatus, followed by Capsella rubella had the highest propor- 
tion of PMEs 0.30% and 0.29%, respectively. Meanwhile 
C. rubella and Arabidopsis thaliana had the highest 
proportion of PMEIs 0.31% and 0.27%, respectively. The 
proportion of PMEs was diminished significantly in the 
grasses as compared to other angiosperms (t-test p < 0.01). 
On average the angiosperms (not including grasses) had 
0.22% PMEs as a proportion of the total predicted gene 
number, while grasses had 0.12%. 
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Transcript expression profiling 

We examined transcript expression data to determine 
whether each predicted LuPME and LuPMEl was expressed, 
and if so, under what circumstances. Data sources for this 
analysis included qRT-PCR experiments (described below), 
published microarray data (Fenart et al, [35]), published 
flax ESTs (Venglat et al, [36]; NCBI), and unpublished 
lUumina RNAseq read data from the flax shoot apical 
meristem (Deyholos et al, manuscript in preparation), 
and from the developing flax stem (One Thousand Plants 
Consortium, manuscript in preparation). 

qRT-PCR using a 96.96 dynamic array 

We used a Fluidigm 96.96 microfluidic array to conduct 
qRT-PCR on 12 different tissues of flax. Five of the tis- 
sues (shoot apex (SA), leaves (L), roots (R), early cortical 
peel (ECP), and early fibers (EF)) were collected from 
vegetative stage plants 1 month after germination; the 
other seven tissues (senescent leaves (SL), xylem (X), late 
cortical peel (LCP), late fibers (LF), flower buds (FB), 
flowers (F), and green bolls (B)) were collected from plants 
2 months after germination, at the green capsule stage. We 
were able to design gene-specific primers for 102 out of 
the 105 predicted PMEs. Transcripts of 62/102 PME 
genes (60.8%) were detected in at least one of the tissues 
(Figure 2A), with a minimum Fluidigm 0.65 quality score 
and in at least 2 out of 3 biological replicates. 40/102 pre- 
dicted PMEs (39.2%) could not be detected in any of the 
tissues assayed by qRT-PCR. However, transcripts of 6/40 
of these genes could be found among public ESTs collec- 
tions (80% coverage and 98% identity), and an additional 
8/40 genes could be identified among reads from RNAseq 
transcript profiling experiments (manuscript in prepar- 
ation) (Tables 1 and 2). Finally, one of the three predicted 
PMEs for which qRT-PCR was not attempted was also 
detected by RNAseq. The unigenes assembled by Fenart 
et al, [35] from ESTs of a flax fiber variety were also 
queried. Out of the 20 unigenes (16 PMEs and 4 PMEIs) 
that mapped with 80% coverage and 98% identity to 
the predicted LuPMEs listed in Tables 1 and 2, 14 PMEs 



and 3 PMEIs were detected within the Fluidigm qRT-PCR 
array, while all were identified by either the EST collec- 
tions or the RNAseq data. In total we were therefore 
able to find experimental evidence for the transcription 
of 77/105 predicted PMEs (Tables 1 and 2). 

We used the same Fluidigm qRT-PCR array system 
to assay transcription of 94 out of 95 predicted PMEIs. 
66/94 genes (70.2%) were detected in one or more tissues 
(Figure 2B), and 28/94 (29.8%) PMEIs were not detected 
in any of the tissues. However, 17/29 of the predicted 
PMEIs that were not detected or assayed by qRT-PCR 
were identified among either public ESTs collections or 
in RNAseq data from developing stems (Table 3). To- 
gether, these data provide evidence that at least 83/95 
(87.4%) of the predicted PMEIs are transcribed. 

Using qRT-PCR and by querying previously published 
and unpublished transcript databases, we were able to 
confirm that 77/105 and 83/95 of the predicted LuPMEs 
and LuPMEIs, respectively, are transcribed. The remaining 
genes might also be transcribed but under conditions 
different from those assayed to date. We note, for ex- 
ample, that none of the tissues surveyed to date are from 
plants subjected to stress, which is likely to induce PMEs 
that may not be otherwise transcribed. Among genes 
that are known to be transcribed, we found transcripts 
expressed in fibers and fiber bearing tissues, during either 
elongation (7 PMEs, 3 PMEIs), thickening (16 PMEs, 10 
PMEIs), or maturation and thickening (19 PMEs, 24 PMEIs) 
(Table 4). These genes are primary targets for manipula- 
tion by reverse genetics, in order to develop flax feed- 
stocks with modified fiber properties. 

Transcript expression patterns 

The tissue in which the largest number of expressed 
PMEs (48/62; 77.4%) was detected was the flower bud. 
Conversely, the tissue in which the lowest number of PMEs 
was detected was senescent leaves (9/62; 14.5%). The highest 
number of PMEIs detected was also in a reproductive 
tissue (flowers; 53/66 (80.3%). Conversely, the tissue with 
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Figure 2 Heat map of transcript abundance of PMEs and PMEIs in different tissues. PMEs are shown in (A) and PMEIs are shown in (B). 
Deita-Cj (Ct of gene minus Q of geometric mean of the endogenous controls). The color of the cell represents transcript abundance. Gray cells 
indicate no transcripts were detected. When two different genes appear in the same row it means one set of primers was used as a common 
assay for both genes. FB: Flower buds; F: Flowers; B: Green boll; ECP: Early cortical peels; EF: Early fibers; LF: Late fibers; SA: Shoot apical meristem; 
X: Xylem; R: Roots; L: Leaves; SL: Senescent leaves. 



the fewest detectable PMEIs was early fibers, with only 
20/66 (303%) (Figure 3). 

We also identified PMEs and PMEIs whose transcript 
abundance was correlated with phloem fiber development. 
The transcript expression of 11 PMEs and 20 PMEIs was 
detected in EF, while 15 PMEs and 26 PMEIs were 
expressed in LF (Figure 2). 15 PMEI were expressed in 



both EF and LF while only one PME was expressed in 
both of these stages. Nine PMEs and five PMEIs were 
detected in EF but not LF, and conversely 13 PME and 
12 PMEI were detected in LF and not EF. In general 
there were more PMEIs expressed in the fibers. Specific- 
ally there were more PMEIs expressed in the LF than in 
the EF, which might indicate that the inhibitory activity 
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Table 1 Major features of type-1 LuPMEs (Continued) 



96 


327 


36.6 


10.22 


RKVL 


154 


n, cy, pi, ct_pl, ch, ct 


cw 


6 


2 


3 


D 








97 


402 


45.1 


10.48 


RRVL 


228 


+ + V, g, n, pi, ex 


cw 


6 


2 


3 


D 








99 


324 


36.2 


8.24 RRLL 


RRML 


277-301 


+ cy, er, ch, pi, v 


cw 


5 


2 


3 


C 


+ 


F 


10 



LuPMES, LuPMESI, LuPME97, and LuPME22 have two PME domains. (SP): Presence of signal peptide; (TM): Presence of transmembrane domain. Subcellular localization: ch: chloroplast; cw: cell wall; cy: cytosol; er: g_ 

endoplasmic reticulum; ex: extracellular/cell wall; g: Golgi apparatus; m: mitochondria; n: nuclear; pi: plasma membrane; v: vacuolar membrane; pe: peroxisome; ct: cytoskeleton. (AS): Number of conserved residues at ro 

active site (out of 6); (ST): Number of conserved stabilizer residues (out of 2). CR: Number of conserved catalytic residues (out of 3). Expression: A gene was reported as positive if the coverage with the EST or ^ ' 

assembled RNAseq sequence was higher than 80% and the identity higher than 98%. (OE) Overall transcript expression based on all the methods assessed: (+) expression was detected with at least one of the 

methods; (-) expression not detected in any of the methods. (FL) Gene expression based on qRT-PCR (Fluidigm): (+) it was expressed in at least one tissue; (-) no expression; (NA) no assay done. (UF): Number of S 

genes aligning with unigenes reported by Fenart et al. [35], 80% coverage and 98%identity. (EV) Venglat et al. [36] ESTs, 80% coverage and 98%identity: (F) flower; (S) stem; (SP) stem peel; (ES) etiolated seedling; (L) §^ 

leaf; (GE) globular embryo; (HE) heart embryo; (TE) torpedo embryo; (ME) mature embryo; (TS) torpedo seed coat; (GS) globular seed coat; (EP) endosperm pooled. (EN) Number of ESTs from NCBI as of April 2013. (RS): 

Alignment with RNAseq assembled sequences data (80% coverage and 98%identity) obtained from one kp project at different positions in the stem: (SA) shoot apaical meristem; (S3): stem 3; (S4) stem 4. (RSA): ^ 

RNAseq shoot apical meristem. !^ 
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LuPMES, LuPMESI, LuPIVlE97, and LuPIVlE22 have two PME domains. (SP): Presence of signal peptide; (TM): Presence of transmembrane domain. Subcellular localization: ch: chloroplast; cw: cell wall; cy: cytosol; er: 
endoplasmic reticulum; ex: extracellular/cell wall; g: Golgi apparatus; m: mitochondria; n: nuclear; pi: plasma membrane; v: vacuolar membrane; pe: peroxisome; ct: cytoskeleton. (AS): Number of conserved residues at 
active site (out of 6); (ST): Number of conserved stabilizer residues (out of 2). CR: Number of conserved catalytic residues (out of 3). Expression: A gene was reported as positive if the coverage with the EST or 
assembled RNAseq sequence was higher than 80% and the identity higher than 98%. (OE) Overall expression based on all the methods assessed: (+) expression was detected with at least one of the methods; (-) 
expression not detected in any of the methods. (FL) Gene expression based on qRT-PCR (Fluidigm): (+) it was expressed in at least one tissue; (-) no expression; (NA) no assay done. (UF): Number of genes aligning 
with unigenes reported by Fenart et al. [35], 80% coverage and 98%identity. (EV) Venglat et al. [36] ESTs, 80% coverage and 98%identity: (F) flower; (S) stem; (SP) stem peel; (ES) etiolated seedling; (L) leaf; (GE) globular 
embryo; (HE) heart embryo; (TE) torpedo embryo; (ME) mature embryo; (TS) torpedo seed coat; (GS) globular seed coat; (EP) endosperm pooled. (EN) Number of ESTs from NCBI as of April 2013. (RS): Alignment with 
RNAseq assembled sequences data (80% coverage and 98%identity) obtained from the OneKP project at different positions in the stem: (SA) shoot apical meristem; (S3): stem 3; (S4) stem 4. (RSA): RNAseq shoot 
apical meristem. 
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Table 3 Important features of LuPMEIs (Continued) 
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Table 3 Important features of LuPMEIs (Continued) 



61 


468 


51.7 


8.7 


- 


- 


n, ch, cy 


pi, n 


62 


118 


13.2 


6.9 


- 


+ 


ch, cy, n, ex 


pl 


63 


170 


18.3 


5.9 


+ 


- 


ch, V, ex, m 


Pl 


64 


177 


19.1 


10 


+ 


- 


ch 


pl 


65 


180 


19.1 


5.3 


+ 


- 


ch, m 


pl 


66 


160 


17.8 


7.8 


+ 


- 


ex, ch, n, cy 


pl 


67 


174 


18.8 


4.6 


+ 


- 


ex, ch 


pl 


68 


177 


19 


5.5 


+ 


- 


ch, ex, V, m 


pl 


69 


179 


19.3 


9.2 


+ 


- 


ch 


pl 


70 


177 


19.2 


10.1 


+ 


- 


ch 


pl 


71 


183 


19.6 


5.7 


+ 


- 


ch, m 


pl 


72 


183 


20 


9.4 


+ 


- 


ch, n 


pl 


73 


212 


22.6 


7.7 


+ 


- 


ex, m, V 


pl 


74 


159 


17.4 


5.5 


+ 


- 


ch, ex, n, cy, m 


pl 


75 


170 


19.1 


9 


+ 


- 


ch, pi 


pl, n 


76 


228 


25 


9 


+ 


- 


ch, V, ex 


n 


77 


220 


23 


4.6 


- 


- 


ch, n 


pl 


78 


163 


17.7 


5.7 


+ 


- 


pi, g, ex, ch 


pl 


79 


155 


16.6 


6.9 


+ 


+ 


er, ex, er_pl, ch, m, pi, cy 


pl 


80 


229 


24.9 


5.7 


+ 


- 


ch, ex 


pl, n 


81 


184 


19.3 


4.8 


- 


- 


ch, m, n 


pl 


82 


334 


34.4 


4.4 


+ 


+ 


ex, ch, V, n 


n 


83 


249 


26.9 


4.7 


+ 


- 


ch, ex, V 


pl, n 


84 


193 


20.5 


4.7 


+ 


- 


ex, ch, V 


pl 


85 


156 


16.9 


8.4 


+ 


- 


ex, V, pi 


pl 


86 


150 


16.2 


9.2 


+ 


+ 


V, ex, pi 


pl 


87 


162 


17.7 


9 


+ 




ch 


pl 


88 


172 


18.3 


5.4 


+ 


+ 


ch 


pl 


89 


155 


16.6 


8.8 


+ 


+ 


ex, pi, V, ch, cy, m 


pl 


90 


188 


19.9 


5.7 






ch, m, cy 


pUy 



7 


4 


3 


3 


1 


3 


G 


+ 






0 


All 


+ 


4 


1 


3 


2 


3 


2 


- 


- 


- 




0 






7 


5 


4 


2 


2 


4 


V 


+ 


+ 


TE 


1 




+ 


7 


4 


4 


3 


1 


4 


1 


+ 


+ 




0 


SA 


+ 


6 


5 


4 


3 


1 


4 


1 


+ 


+ 


1 


0 


all 


+ 


7 


3 


4 


0 


2 


4 


T 


+ 


+ 


TS, EP 


4 


SA 




11 


5 


4 


3 


1 


4 


K 


+ 


+ 




0 






7 


5 


4 


3 


2 


4 


V 


+ 


+ 




0 




+ 


7 


5 


4 


3 


2 


4 


1 


+ 


- 




0 




+ 


7 


4 


4 


3 


1 


4 


1 


+ 


- 




0 


All 


+ 


6 


5 


4 


3 


1 


4 


1 


+ 


+ 




0 




+ 


7 


5 


4 


1 


1 


4 


1 


+ 


+ 




0 




+ 


7 


5 


4 


2 


1 


4 


1 


+ 


+ 




0 


SA and S4 


+ 


6 


3 


4 


3 


1 


3 


1 


+ 


+ 




0 


S3 and 4 


+ 


7 


5 


3 


1 


2 


4 


D 


+ 


+ 




0 


S4 


+ 


9 


3 


2 


0 


0 


4 


P 


+ 


+ 


F 


1 






9 


3 


3 


1 


1 


4 


P 


+ 


+ 




0 






11 


4 


2 


1 


1 


4 


K 


+ 


- 




0 


SA and S3 




10 


4 


3 


1 


0 


4 


T 


+ 


- 




0 


S3 and 4 




7 


4 


3 


2 


2 


4 


A 


- 


- 




0 






5 


2 


4 


2 


1 


3 


P 


+ 




1 


0 


S3 


+ 


8 


4 


2 


1 


2 


4 


P 


+ 


+ 




0 






7 


4 


3 


1 


2 


4 


A 


+ 


+ 




0 






7 


5 


4 


2 


1 


4 


V 


+ 


+ 




0 


S3 


+ 


8 


3 


3 


0 


0 


4 


R 


+ 


+ 




0 


All 


+ 


10 


2 


4 


0 


1 


4 


A 


+ 


+ 




0 


All 


+ 


7 


5 


4 


3 


1 


4 


1 


+ 


+ 




0 




+ 


7 


5 


4 


3 


1 


4 


1 


+ 


+ 


TS 


1 


All 


+ 


10 


2 


4 


0 


1 


4 


A 


+ 


+ 




0 


S3 and 4 


+ 


6 


3 


3 


2 


1 


2 


P 


+ 


+ 


F 


1 




+ 



3- -O 

ID N 

^ 0^ 



5 O 

cr ^ 

o" ^ 

3 ^ 

§ - 

^ IT 

QJ_ o_ 

Q o 

O to 

11 

— » o 

ON 2 



o 



Table 3 Important features of LuPMEIs (Continued) 
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(SP): presence of signal peptide; (TM): presence of transmembrane domain. Subcellular localization: ch: chloroplast; cw: cell wall; cy: cytosol; er: endoplasmic reticulum; ex: extracellular/cell wall; g: Golgi apparatus; m: 
mitochondria; n: nuclear; pl: plasma membrane; v: vacuolar membrane; pe: peroxisome; ct: cytoskeleton. Number of conserved residues at each one of the following: (*) non polar bundle-hairpin interface (out of 12); 
(§) polar bundle-hairpin interface (out of 5); (□) Polar interacting with PME aromatic residues (out of 5); (£2) acidic patch on a2 helix (out of 3); (£3) acidic patch on a3 helix (out of 3); (¥) disulphide bridge (out of 4). 
Expression: A gene was reported as positive if the coverage with the EST or assembled RNAseq sequence was higher than 80% and the identity higher than 98%. (OE) Overall expression based on all the methods 
assessed: (+) expression was detected with at least one of the methods; (-) expression not detected in any of the methods. (FL) Gene expression based on qRT-PCR (Fluidigm): (+) it was expressed in at least one tissue; 
(-) no expression; (NA) no assay done. (UF): Number of genes aligning with unigenes reported by Fenart et al. [35], 80% coverage and 98% identity. (EV) Venglat et al. [36] ESTs, 80% coverage and 98%identity: (F) 
flower; (S) stem; (SP) stem peel; (ES) etiolated seedling; (L) leaf; (GE) globular embryo; (HE) heart embryo; (TE) torpedo embryo; (ME) mature embryo; (TS) torpedo seed coat; (GS) globular seed coat; (EP) endosperm 
pooled. (EN) Number of ESTs from NCBI as of April 2013. (RS): Alignment with RNAseq assembled sequences data (80% coverage and 98%identity) obtained from one kp project at different positions in the stem: (SA) 
shoot apical meristem; (S3): stem 3; (S4) stem 4. (RSA): RNAseq shoot apical meristem. 
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Table 4 Genes putatively associated with fiber Table 4 Genes putatively associated with fiber 

development during elongation, thickening, and development during elongation, thickening, and 

maturation processes maturation processes (Continued) 





Gene 


Expression pattern 


Homologous to 


§ 


LuPMEIl 7 


LC-EC 


§ 


LuPMEl 


L-E 


AtPME35 


§ 


LuPMEI19 


LC-EC 


§ 


LuPMES 


0-X; L-E 


PttPMEl 


§ 


LuPMEI23 


0-X 


§ 


LuPMES 


L-E 




§ 


LuPMEI24 


LC-EC 


¥ 


LuPME7 


0-X; E-L 


PttPMEl 


§ 


LuPMEI25 


LC-EC 


¥ 


LuPMES 


E-L 






LuPMEI27 


SA-S 


§ 


LuPME9 


L-E 




§ 


LuPMEI2S 


LC-EC 


¥ 


LuPMElO 


0-X; E-L; EC-LC 




¥ 


LuPMEI29 


EC-LC 


¥ 


LuPMEl 7 


0-X; E-L; EC-LC 




§ 


LuPMEBS 


L-E 


¥ 


LuPMElS 


0-X; E-L 




§ 


LuPMEI40 


0-X 




LuPME21 


SA-S 




§ 


LuPMEI41 


0-X; L-E; LC-EC 


§ 


LuPME24 


L-E 




¥ 


LuPMEI43 


EC-LC 


¥ 


LuPME27 


0-X; EC-LC 




¥ 


LuPMEI44 


E-L 


§ 


LuPMESl 


0-X; L-E 






LuPMEI45 


SA-S 


§ 


LuPME32 


OV 




§ 


LuPMEISI 


L-E; LC-EC 


¥ 


LuPME34 and LuPME59 


E-L 




¥ 


LuPMEI54 


0-X; EC-LC 


¥ 


LuPME39 


0-X; E-L 




§ 


LuPMEISS 


OV 


§ 


LuPME40 


0-X; L-E 




§ 


LuPMEI56 


0-X; L-E 


¥ 


LuPME43 


0-X; EC-LC 






LuPMEI59 


SA-S 


¥ 


LuPME44 


0-X; E-L 




§ 


LuPMEI63 and LuPMEI6S 


0-X; L-E 




LuPME46 


SA-S 




¥ 


LuPMEI65 


0-X; E-L; EC-LC 


§ 


LuPME48 


0-X; L-E 




¥ 


LuPMEI66 


E-L 


¥ 


LuPMESO and LuPME73 


E-L 




§ 


LuPMEI6S 


0-X 


§ 


LuPME53 


L-E 




§ 


LuPMEI71 


0-X 


§ 


LuPME54 


0-X; L-E 




¥ 


LuPMEI72 


EC-LC 


¥ 


LuPME55 


EC-LC 




§ 


LuPMEI74 


L-E 


§ 


LuPME56 


0-X; L-E 




¥ 


LuPMEI76 


EC-LC 


§ 


LuPMLol 


0-X; LC-EC 


AtPML35 


§ 


LuPMEIS2 


L-E 


§ 


LuPME66 


L-E; LC-EC 




§ 


LuPMEISS 


OV 




LuPME67 


SA-S 




¥ 


LuPMEISS 


0-X; E-L 


§ 


LuPME68 


0-X; L-E 




§ 


LuPMEIS6 


L-E 


§ 


LuPME69 


0-X 




§ 


LuPMEIS7 


0-X 




LuPME70 


SA-S 




¥ 


LuPMEIS9 


E-L 


§ 


LuPME71 


OV; 0-X; LC-EC; LCP 




§ 


LuPMEI90 


0-X; L-E; LC-EC 


¥ 


LuPME73 


0-X; EC-LC 




§ 


LuPMEI94 


L-E 




LuPME79 


SA-S 




§ 


LuPMEI95 


L-E 


§ 


LuPMESl 


0-X 




Genes putatively associated with specific fiber development processes are marked 


¥ 


LuPMESS 


0-X; EC-LC 




by symbols: elongation (*), thickening (¥), and thickening and maturation (§). If 








two genes are shown in one line it means the primers used amplified both genes. 


§ 


LuPME91 


L-E 




(OV) expressed only in vascular tissues; (SA-S) present in SA but not in the rest of 




LuPME92 


SA-S 


PttPMEl 


the stem (EF, LF, ECP, LCP); (0-X) present in outer tissues (cortical peel and fibers) 




but not xylem; (L-E) present in LF but not EF; (E-L) present in EF but not LF; (EC-LC) 




LuPME102 


OV; SA-S 




present in ECP but not LCP; (LC-EC) present in LCP but not ECP; (SA) only present 








in SA; (LCP) only present in LCP. The genes underlined were found to be "SA-S" 


¥ 


LuPME103 


0-X; EC-LC 




using the qRT-PCR data, however, according to RNAseq assembled reads they are 


¥ 


LuPMEl 05 


0-X; EC-LC 




present in other parts of the stem in addition to SA. As they were only detected in 
SA with qRT-PCR it is expected that their expression is higher in the SA. 


§ 


LuPMEIl 


0-X; LC-EC 
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Figure 3 Percentage of genes detected by qRT-PCR per tissue. 

The percentage was calculated based on those genes that showed 
transcript expression in at least one tissue. 



of the PMEIs is low at early stages of fiber development 
(i.e. EF stage), when fibers actively synthesize secondary 
cell walls, and demethylesterification of the newly syn- 
thesized homogalacturonan is required. Fiowever, when 
the cell wall deposition ceases, in the late fiber stage, 
PMEIs are expressed, and so the PME activity dimin- 
ishes. Seven PMEs and three PMEI were expressed in 
the shoot apex (SA), but not in any other of the stem 
vascular tissues. Moreover, nine PMEs and six PMEIs 
were expressed in the early cortical (ECP) peel, but not 
the late cortical peel (LCP), and three PMEs and nine 
PMEIs were expressed in late cortical peel but not early 
cortical peel (Figures 4A and 5A). 13 PMEs and 14 PMEIs 
were found only in reproductive tissues; and three PMEs 
and two PMEI were found only in vascular tissues 
(Figures 4B and 5B). Seven PMEs and six PMEIs showed 
specific transcript expression in only one tissue/organ; 
these transcripts were detected in flower buds (four PMEs 
and two PMEIs), flowers (one PMEI), bolls (one of each), 
xylem (one PMEI), roots (one PMEI), late cortical peel 
(one PME), and shoot apex (one PME). Two of these 
might be important for phloem fiber development: 
LuPME71, which was detected only in LCP, a fiber 
containing tissue where secondary cell wall deposition 



and maturation is taking place, and LuPME102, only 
detected in the SA, where intrusive growth takes place 
(Figure 2A-B). 

Previous publications have reported transcript expres- 
sion patterns of specific LuPMEs. Al-Qsous et al, [13] 
found that the transcript abundance of LuPMES is higher 
than either LuPMEl or LuPME3 in hypocotyls. Our 
results were consistent with these observations: The 
calculated delta-Cx of LuPMES was higher than LuPMEl 
or LuPME3 in all the tissues tested (Figure 2). Our tran- 
script abundance data also showed that LuPMES was 
expressed in the shoot apex, while LuPME3 was not, 
which could be correlated with the findings that showed 
that LuPMES transcript abundance was higher in the 
upper parts of the hypocotyl after two days of growth, 
while LuPME3 was higher in the bottom of the hypo- 
cotyl [13]. Mareck et al, [14] found a very high tran- 
script abundance of LuPME3 in roots, as observed with 
the promoter fusion in tobacco [11], in which a GUS 
construct using LuPME3 promoter was used to detect 
its expression in stems, roots and leaves. The expression 
was observed in the vascular tissues of roots, shoots and 
young leaves. This correlates with our results, as we 
detected LuPME3 transcript expression only in roots, 
late fibers and the boll. Our study used mature leaves, 
rather than young leaves, which may explain why we 
failed to detect transcript expression in this tissue, in 
contrast to Mareck et al 

Protein subcellular localization 

To be secreted to the cell wall via the Golgi apparatus 
and secretory pathway, PMEs and PMEIs require an 
N-terminal signal peptide or a transmembrane domain 
[4]. As shown in Figure 6, we found that 71/lOS LuPMEs 
had a predicted transmembrane domain and/or signal 
peptide, and 81/9S LuPMEIs had a transmembrane 
domain and/or signal peptide (Tables 1, 2 and 3). To 
further investigate subcellular localization, we used WoLF 
PSORT and Plant-mPLoc [23,24]. Using Wolf PSORT we 
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Figure 4 LuPME transcript expression in various tissues. Venn diagram showing the number of LuPMEs detected in phloem-fiber containing tissues 
(A) and in tissue systems (B) ECP: early cortical peels. LCP: late cortical peels. EF: early fibers. LP: late fibers. RT: reproductive tissues. VS: Vascular 
tissues at shoot. R: Root . LSL: Leaves and senescent leaves. 
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ECP 




Figure 5 LuPMEl transcript expression in various tissues. Venn diagram showing tine number of LuPMEIs detected in pliloem-fiber containing 
tissues (A) and in tissue systems (B) ECP: early cortical peels. LCP: late cortical peels. EF: early fibers. LP: late fibers. RT: reproductive tissues. 
VS: Vascular tissues at shoot. R: Root . LSL: Leaves and senescent leaves. 



found 56/105 LuPMEs, and 71/95 LuPMEIs that were 
predicted to be secreted to either the cell wall or to 
the plasma membrane. Plant-mPLoc predicted 104/105 
LuPMEs to be secreted to the cell wall, and 88/95 
LuPMEIs were predicted to be secreted to the plasma 
membrane or the cell wall. In total all the LuPMEs 
and 93/95 LuPMEIs were predicted to be extracellular, 
based on protein subcellular localization software pre- 
diction and/or the presence of signal peptide and/or 
transmembrane domain. The two LuPMEIs, LuPMEIll 
and LuPMEI46 that were not predicted to be extracel- 
lular were predicted by both software tools to be local- 
ized to the nucleus. 

Cleavage site 

During their maturation, most Type-1 PMEs are proteo- 
lytically cleaved at one of two possible sites between the 
PMEI domain and the PME domain, before exiting the 
Golgi apparatus. These sites are designated binding 
motif 1 (BMl), and binding motif 2 (BM2) (Figure 1), 
and are separated by between 11-32 amino acids in 
Arabidopsis [8]. A conserved cleavage site consisting of 
four residues with the pattern [RKQ] [RKEHLN] [LDMI] 
[LMAKR] was previously defined by analysis of A. thaliana 
PMEs [4,8]. We identified this pattern at a single site 



in each of 25/60 of the predicted flax Type-1 PMEs. 
Moreover, 19/60 flax Type-1 PMEs had two sites that 
matched the previously defined pattern, and these were 
separated by between 14 and 33 residues. In 3/60 of the 
LuPMEs, a novel tetrapeptide motif (RRKL or GRLL) was 
found in the place of the conserved pattern in BM2. Other 
novel motifs were also found, but these were all accom- 
panied by a conserved motif in the other binding site, 
we found RKVA and RRLW in BMl, and REYL and 
RRFL in BM2. In 13/60 Type-1 PMEs, a cleavage site 
motif was not found (Table 1). Figure 7 shows the distribu- 
tion of sizes of the mature proteins of PMEs and PMEIs 
(after signal peptide and/or pro-region removal, if present). 

There was no significant difference (Fisher s Exact Test 
p > 0.05) in the distribution of the cleavage site features 
between the 45/60 predicted PMEs for which evidence 
of transcription has been found, and the 15/60 predicted 
PMEs for which no evidence has been found. The in- 
ability to cleave out the PMEI domain (pro-region) 
would presumably prevent the export of the PMEs to 
the cell wall, according to Wolf et al [8], who showed 
that unprocessed Type-1 PMEs are retained in the Golgi 
apparatus. Nevertheless, LuPME5 was detected in both 
the unprocessed and processed forms in cell walls of 
flax cell cultures [13], and LuPME3 was only detected in 




Figure 6 Subcellular localization of PMEs and PMEIs. Number of PMEs and PMEIs with transmembrane domains, signal peptides, and predicted to 
be secreted to the cell wall or plasma membrane using WolF PSORT and Plant-mPLoc. 

V . ) 
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Type-1 PME Type-2 PME PMEI 

Figure 7 Size of mature proteins. The signal peptide and/or cleavage 
site, if present, were removed. 



the unprocessed form in flax seedlings and callus [14]. 
This raises the possibility that the processing of Type-1 
LuPMEs might be dispensable for the proper function- 
ing or at least localization of the protein. 

Isoelectric point 

The isoelectric point is one of the factors that influences 
the action of PMEs (i.e. random, acidic pi, or blockwise, 
alkaline pi [37]) and so can facilitate either stiffening or 
loosening of the cell wall. Consequently, the prediction 
of the pi of the proteins contributes to the definition of 
their physiological role in the plant. We calculated the 
pi for the mature proteins (i.e. with any signal peptide or 
pro-region removed) (Figure 8). Most of the PMEIs (51) 
had an acidic pi, while 26 had a basic pi (pH x: 6.48, 
a^:1.77.). On the other hand, most of the PMEs (70) had 
a basic (above pH8.0) pi (pH x: 8.26, o^: 1.46.), while 
just a few (13) had an acidic pi, below 6: Out of these, 
four showed tissue specificity: Two in FB: LuPME93 and 
LuPME37, and two in fiber containing tissues: LCP 
(LuPME71), and SA (LuPME102). As an acidic pi would 
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Figure 8 Isoelectric point of mature proteins. The signal peptide 
and/or cleavage site, if present, were removed. 



lead to random demethylesterification, which ultimately 
could lead to ceU waU loosening; it could be expected 
that when LuPME71 is expressed in the LCP, it de- 
creases the rigidity of the connections between fibers, 
while LuPME102 might enhance fiber growth as it could 
loosen the connections between parenchyma cells to 
facilitate fiber intrusive growth. Only one of the genes 
whose transcript expression was detected in the EF or 
LF had an alkaline pi, this was LuPME66 (pi 5.87), 
which was expressed in reproductive tissues, xylem, LCP 
and LF. This could have a role similar to the described 
for LuPME71. 

The wide range of predicted isoelectric points for 
the mature PMEs (pH 4.75 to 11.25; Tables 1 and 2) is 
consistent with previous reports from flax. Gaffe et aL 
[38] tested the PME activity from ceU waU of flax calli; 
they found isoforms at pi 5.5, 7, 7.3, 7.8, 8.8, and 10. 
Mareck et al, [39] found a similar range of isoforms 
in flax calli; they found PMEs with pis 4.3, 4.8, 6.3, 
7.1, 7.6, and 9.6. Alexandre et al [40] found 2 PMEs 
in hypocotyls, at pH 8 to 9, and at pH 9.5 to 10. Al-Qsous 
et al [13] observed PME activity in the hypocotyls at 5 
different isoelectric points, from pH 7.0 to 10.0. Finally, 
Mareck et al [14] also found a similar pattern of the 
isoforms in epicotyls, cotyledons, hypocotyls and roots; 
they found two neutral, four basic and one strongly 
basic PME isoform. 

Phylogenetic analysis 

To classify the predicted LuPMEs and LuPMEIs on the 
basis of amino acid sequence similarity and inferred 
evolutionary relationships, we aligned their amino acid 
sequences with predicted PMEs and PMEIs from four 
other angiosperms: cassava {Manihot esculenta), castor 
{Ricinus communis), poplar {Populus trichocarpa), and 
A, thaliana. These species were chosen because Ara- 
bidopsis is a well-characterized model organism, and 
castor, cassava, and poplar are in the same taxonomic 
order (Malpighiales) as flax, and whole-genome assem- 
blies are available for each of these species. Following 
alignment, maximum likelihood phylogenetic trees for 
PMEs (Figure 9) and PMEIs (Figure 10) were constructed. 
Based on the groups defined by Louvet et al [41] for 
Arabidopsis PMEs, the branch length, and the bootstrap 
values (Additional file 4: Figure S2), six major monophy- 
letic groups of PME could be defined (A, B, C, D, E, and 
F which correspond to groups 3, 1, 2, 1, 4, and 1 re- 
spectively in Louvet et al [41]). Group A included five 
LuPMEs; three of them were Type-1 PME, and none 
of them had a cleavage recognition site. The PMEs in 
all the organisms of group B were Type-1 PMEs, and 
all the LuPMEs in this group had a cleavage recognition 
site, and a transmembrane domain, but no predicted 
signal peptide. Group C was composed of Type-1 and 
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Figure 9 Maximum lil<elihood dendrogram of PMEs in flax and related species. The main groups (labeled A-F) and some subgroups are 
shown. The previously reported LuPMEs and the homologous LuPMEs to PttPMEl and AtPME35 are labeled. Red: Linum usitatissimum; Purple: 
Manihot esculenta; Blue: Ricinus communis; Orange: Populus trichocarpa; Green: Arabidopsis thaliana. 100 bootstraps and 2 search-replicates 
(bootstrap values shown in Additional file 4: Figure S2). 

V J 



Type-2 PMEs. All of the Type-1 PMEs had a cleavage 
recognition site, and none of the Type-2 PMEs had either 
a signal peptide or transmembrane domain. Group D con- 
tained the previously described LuPMEl, LuPMES, and 
LuPMES, we did not find any characteristic defining this 
group based on the parameters we described above (e.g. 
Table 1). The PMEs of all the species in group E were 
Type-2 PMEs. Finally group F contained PMEs of all the 
species analyzed except flax. 

In the PMEI phylogenetic tree (Figure 10), groups 
were distinguished by very low bootstrap values in the 
base nodes (Additional file 5: Figure S3), making sub- 
classification of PMEIs ambiguous. Furthermore, we did 
not find any common sequence features that distinguished 
subgroups of PMEIs from each other. 

Phylogenetic analysis of LuPMEs and LuPMEIs grouped 
43 pairs of LuPMEs (out of 105 genes in total) and 39 
pairs of LuPMEIs (out of 95 genes in total) at the terminal 
nodes of the tree (Figures 9 and 10). The remaining genes, 
19 PMEs and 17 PMEIs, did not have obvious paralogs. 
The pairs of genes were confirmed by reciprocal BLASTn 



and BLASTp to test if they were the best BLAST match. 
38/43 pair of LuPMEs and 38/39 pairs of LuPMEIs were 
found to be the best match to one another. This suggests 
these probably originated from a recent whole genome 
duplication event believed to have occurred 5 to 9 MYA 
[15]. Indeed, for the LuPMEIs, the estimated time of diver- 
gence of presumptive paralogs was calculated to be 4.5-8.4 
MYA, and for the LuPMEs estimate was 6.4 -11.9 MYA 
(Additional file 6: Table S2). Lineage-specific expansion of 
groups of PMEs or PMEIs may indicate that selection had 
occurred for particular functions in flax. Expansion of at 
least three sub-groups (CI, D2, and E3) of PMEs was 
evident in the ML tree (Figure 9). 

We identified PMEs genes that have been associated 
with stem development in previous studies in other species, 
and found their presumptive homologs in flax. Siedlecka 
et al [42] found that when the transcript abundance of 
PttPMEl (accession no. AJ277547) in hybrid Aspen {Popu- 
lus tremula x tremuloides) increased, the fiber elongation 
decreased, and conversely, when the transcript abundance 
of the gene was low it stimulated fiber elongation. The 



Pinzon-Latorre and Deyholos BMC Genomics 2013, 14:742 
http://www.bionnedcentral.conn/1471 -21 64/1 4/742 



Page 21 of 25 




closest PMEs in P. trichocarpa for PttPMEl are Ptl 
(POPTR_0001sl625ai) and Pt75 (POPTR_0214s0020ai). 
Based on the phylogenetic tree, we found three LuPMEs 
that were closely related to Ptl and Pt75; they were 
LuPME7, LuPME92, and LuPMES (Figure 9; Table 1), 
which were all type-1 PME, as PttPMEl. The study of 
these genes will be important in future studies as they 
may regulate fiber elongation the same way as in poplar, 
as the fibers of both plants elongate intrusively. Hongo et al 
[43] found that the type-1 PME AtPME35 (At3g59010), 
has a role in strengthening the inflorescence stem of 
Arabidopsis, by mediating the demethylesterification of 
the primary cell wall of cortical cells and interfascicular 
fibers, this gene was suggested to have a blockwise 
demethylesterification action. We inferred that the com- 
mon ancestor of LuPME61 and LuPMEl is the likely 
ortholog to AtPME35 (Figure 9; Table 4). Both LuPME61 
and LuPMEl have basic pis (9.42 and 8.82, respectively), 
similar to the pi of the mature protein of AtPME35 (pi 
8.70), so it is possible that they also have a blockwise 
demethylesterification activity similar to AtPME35, which 



leads to stiffening of the cell wall. The study of loss-of- 
function mutants for these genes in flax might identify 
informative phenotypes related to stem development. 

Conserved residues in PMEs 

We searched the predicted LuPMEs for conserved amino 
acid residues previously reported to be important for PME 
function. These amino acids were identified in a Type-1 
PME from tomato {Solarium lycopersicum), (PMEl_SOLLC, 
SwissProt P14280), and the positions listed here refer to 
that sequence [7]. Three residues are proposed to be 
catalytic residues: D132, D153, and R221. Two residues, 
Q109 and Q131, are believed to stabilize the intermediate 
that is formed after nucleophilic attack on the carboxyl- 
methyl group. Finally, six aromatic amino acids at con- 
served positions are required for substrate binding (F80, 
Y135, F156, Y218, W223, and W248), and of these F80, 
Y135, and W223 are possible targets of the PMEIs. We 
searched the predicted LuPMEs for all eleven of the 
residues that have been proposed to be critical to PME 
function (Tables 1 and 2). The most highly conserved 
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residues were the catalytic residues: D132, D153 , and 
R221, which could be identified in (91/105), (96/105), 
and (97/105) of the predicted LuPMEs, respectively 
(Figure 11 A). The aromatic residues responsible for sub- 
strate binding were also highly conserved, where any 
aromatic residue is considered to be a conserved residue 
in comparison to the substrate binding positions defined 
in PMEl_SOLLC (i.e. F80, Y135, F156, Y218, W223, and 
W248) . The most highly conserved of these were W223 
(93/105) and W248 (92/105), and the conservation of 
other substrate binding residues was also high: F80 
(74/105), Y135 (77/105), F156 (89/105), and Y218 (89/105). 
Among these aromatic residues there were three posi- 
tions (F156, W223 and W248) in which the identity 
(and not merely similarity) of the aromatic amino 
acids was also highly conserved: at F156, F was found 
in 88/89 of the LuPMEs that had an aromatic residues 
at that position; at W223, 90/93 aromatic residues 
were W, and in W248 91/92 aromatic residues were 
W. F80, Y135, and W223 are responsible of substrate 
binding of the PMEs and also interact with the PMEI. 
F80 generated the highest number of contacts (17 in 
total) with the PMEI [7]. Out of all the aromatic resi- 
dues, position F80 was the least conserved, followed 
by Y135, meanwhile W223 showed the highest conser- 
vation. This might imply that F80 and Y135 are not 
fundamental for binding to the substrate, as it might 
be W223, and on the other hand, lacking these resi- 
dues may limit inhibition by the PMEIs. 

Conserved residues PMEI 

PMEIs have four antiparalell alpha-helices (al, a2, a3 
and a4) arranged in an up and down topology, and three 



short alpha-helices (aa, ab, and ac) at the N-terminus 
[7]. Two groups have studied PMEI structure in detail. 
Di Matteo et al [7] identified residues with important 
roles in the structure and activity of a PMEI from kiwi 
{Actinidia deliciosa; PMEI_ACTDE, SwissProt accession 
P83326), focusing on the interaction with a PME. They 
identified cysteine residues that generated disulfide brid- 
ges conecting helices aa and ab (C9 and C18), and helices 
a2 and a3 (C74 and CI 14). Furthermore, PMEI residues 
T73, E76, N77, T113, and D116 were found to allow 
interaction of the PMEI with three of the aromatic resi- 
dues of the PME (F80, Y135, and W223). An acidic 
patch was formed by three conserved residues on both 
the a2 helix (E76, D80, and D83) and on the a3 helix 
(D96, D109, and D116). Finally, salt bridges occurred 
between the PMEI residues D116 and E76 and PME 
residues K224 and R81. Hothorn et al [6] also studied 
the important residues for the PMEI activity and structure, 
using Arabidopsis PMEI-1 (PMEI1_ARATH, SwissProt 
accession number Q9LNF2) as a model. They identified 
a disulfide bridge connecting helices 5 and 6 (a2 and a3 
in Di Matteo et al, [7]), formed by residues C71 and 
cm. They also identified a residue responsible for the 
N-terminal orientation (P28) that is located between 
the three N-terminal a-helices and the four a-helices 
towards the C-terminal; and several residues contribut- 
ing to the bundle-hairpin interface. 

We searched the predicted PMEIs of flax for the critical 
residues identified by Di Matteo et al [7] and Hothorn 
et al [6]. Our results (Table 3) are presented using as 
reference positions the mature PMEI_ACTDE protein. 
The residues in LuPMEIs with the highest conservation 
were C74 and CI 14 (both 94/95), which generate a 
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Figure 11 Tertiary structures siiowing tiie conserved residues analyzed in relation to the reported structures. Structures shown are for 
PME1_S0LLC (A) and PMELACTDE (B), PDB: 1XG2. For the residues involved in bundle-hairpin interface in PMELACTDE only those residues with 
conservation higher than 80% are labeled in the structure. 
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disulfide bridge (Figure IIB). The conservation of the 
other two cysteines, C9 and C18, which stabiUze the 
protein by hydrophobic interactions, was slightly lower, 
(86/95) and (88/95), respectively. The conservation (i.e. 
similarity, not necessarily identity) in the polar PMEI 
residues that interact with the aromatic PME residues, 
F80, Y135, and W223, was higher than 80% in all the 
cases: T73 (78/95), E76 (89/95), N77 (78/95), T113 (78/95), 
and D116 (85/95). 87.4% (83/95) of the PMEIs had con- 
servation of at least 3 out of the 5 polar residues. On the 
other hand, the Aspartic acid (D), and Glutamic acid (E) 
residues that are predicted to be important in the gener- 
ation of an acidic patch on alpha helices 2 and 3, had a 
low conservation (i.e. similarity, not necessarily iden- 
tity), 47.5% on average. The conservation of the residues 
contributing to the bundle-hairpin interface was also 
low. Out of the 12 non-polar residues analyzed, only five 
were conserved (i.e. similarity, not necessarily identity) 
in more than 80% of the LuPMEIs, those are 15 (82/95), 
L22 (89/95), L36 (92/95), L138 (84/95), and 1145 (89/95), 
the rest (18, L17, A21, L33, 1141, V144, and L148) were 
below 70% of conservation. 

Gene expression and conserved residues 

We tested whether there was any correlation between the 
transcript expression evidence we obtained and the pres- 
ence of the critical residues described above (Tables 1, 2 
and 3). In general, genes that are not expressed may ac- 
cumulate mutations, including mutations in residues 
critical to the normal function of the protein. We found 
that when we analyzed the 33 critical residues in the 
LuPMEIs as a group, the conservation of these residues 
among the expressed genes (73.2%) was significantly 
higher than in the genes for which transcript expression 
was not detected (63.9%) (Fisher s Exact Test, p < 0.05). 
Taking together the 11 critical residues studied in the 
LuPMEs, we found that in the expressed genes, the level 
of conservation (86.8%) was significantly higher than 
the conservation observed in the non-expressed genes 
(75.3%) (Fisher s Exact Test, p < 0.05). However when 
we individually analyzed the residues we found that in 
the LuPMEs only three out of the 11 residues (Q109, 
F156, and R221) showed significantly higher conserva- 
tion in the expressed genes when compared to the genes 
without evidence of transcript expression (Fisher s Exact 
Test, p < 0.05). In the remaining eight there was no sig- 
nificant difference (Fisher s Exact Test, p > 0.05). The 
critical residue that showed the greatest change in con- 
servation in relation to transcript expression was Q109, 
which was found in 83.1% of expressed LuPMEs, but 
only 57.1% of LuPMEs with no evidence of transcript 
expression. In the same way, we individually analyzed 
the LuPMEIs conserved residues, and found that only 5 
of the 33 residues (L36, V144, C18, R27, and D83 ) have 



a significant higher conservation in expressed genes respect 
to non-expressed genes (Fishers Exact Test, p < 0.05). In 
the remaining 28 residues there was no significant dif- 
ference (Fisher s Exact Test, p > 0.05). The two greatest 
differences were observed in two of the residues con- 
tributing to the bundle-hairpin interface: V144 was con- 
served in 71.1% of expressed LuPMEs, but only 33.3% of 
non expressed genes, and R27 from 59.0% to 16.7%. 
Interestingly, the change in conservation was very dif- 
ferent between the two pairs of cysteines of the PMEIs, 
which generate the disulfide bridges, C74 and CI 14 
conservation changed both from 98.8% in expressed 
genes, to 100% in both, in non-expressed genes. Con- 
versely, the conservation at positions C9 and CI 8 was 
reduced drastically, although not significantly in C9, 
from 92.8% and 96.4% in expressed genes, to 75.0% 
and 66.7%, respectevely in non expressed genes. This 
might indicate that there is more evolutionary pres- 
sure on residues C74 and CI 14, which indicates that 
they might be more important for the structure of the 
protein. 

The residue at position K31 (respect to PMEI_ACTDE 
[7]) or P28 (respect to PMEI1_ARATH [6]) has been 
reported to affect the orientation of the N-terminal of 
the PMEIs. We found 13 different amino acids at this 
position. There were three prevalent residues in this 
position, they were P (19/95), A (18/95), and I (17/95) 
residues. Hothorn et al [6], found that when they mu- 
tated P28 to Ala in PMEI1_ARATH, the inhibitory ac- 
tivity of the protein diminished. So the 18 PMEIs with 
alanine could have a decline in activity. 

Conclusions 

PMEs, regulated in part by PMEIs, modify cell and tissue 
properties by demethylesterification of pectins within 
cell walls and the middle lamella. Flax phloem fibers 
elongate intrusively by penetrating the middle lamella of 
the cells below and above them. This process requires 
the loosening of the middle lamella, the strengthening 
of the cell wall of the growing fiber, and then the cre- 
ation of a new contact interface with the cell. In these 
processes, PMEs might be involved. Here we described 
105 putative PMEs and 95 putative PMEIs in the flax 
genome, of which 77 and 83 genes, respectively, had evi- 
dence of transcript expression. The proportion of PMEs 
and PMEIs in the flax genome as a proportion of all 
predicted proteins was similar to most other dicots 
(Additional file 3: Figure SI). We defined a list of can- 
didate genes that could play a role in fiber development 
(Table 4), their specific transcript expression in fiber con- 
taining tissues and mutant of these genes should be ana- 
lyzed in detailed in future studies, and these genes area 
target for manipulation through reverse genetics. 
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Additional files 



Additional file 1: Table SI. List of primers ar^d TaqMan probes used 
for qRT-PCR. 

Additional file 2: FASTA formatted LuPME and LuPMEl sequences. 

Additional file 3: Figure SI. Percentage of PMEs ar^d PMEIs, respect to 
the total number of proteins, in Embryophyta plants with available full 
genomes in Phytozome (version 9.1). 

Additional file 4: Figure S2. Maximum likelihood tree of PMEs in Flax 
and related species. The main groups, and important subgroup are 
shown. The homologous LuPMEs to PttPMEl and AtPME35 are shown 
with an arrow. Red; Linum usitatissimum; Purple (Me): Manihot esculenta; 
Blue (Rc): Ricinus communis; Orange (Pt): Populus trichocarpo; Green: 
Arabidopsis thaliana. 100 bootstraps and 2 search-replicates. 

Additional file 5: Figure S3. Maximum likelihood tree of PMEIs in flax 
and related species. The main groups, and subgroup Dl are shown. Red: 
Linum usitatissimum; Purple (Me): Maniliot esculenta; Blue (Rc): Ricinus 
communis; Orange (Pt): Popuius tricinocarpa; Green: Arabidopsis tlialiana. 
100 bootstraps and 2 search-replicates. 

Additional file 6 Table S2. Genetic distance between possible paralogs 
in the LuPMEs and LuPMEIs gene families. The Kimura 2-parameter model 
was used to calculate the genetic distance, which was used to calculate 
the divergence time using t = K/2r, where t is time, K is the genetic 
distance, and r is the substitution rate, either 1.5 x 10"^ or 8.1 x 10"^. 
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